A heuristic algorithm for hierarchical representation of form documents

Authors

  • Pinar Duygulu Sahin
  • Volkan Atalay
  • Ebru Dincel
Abstract

In this paper, our aim is to develop a logical representation for form documents. We propose a hierarchical structure that represents the logical layout of a form by using its lines. The approach is top-down, and no domain knowledge such as the preprinted data or filled-in data is used. Forms that are logically the same but physically different are associated with the same hierarchical tree. This representation can handle geometrical modifications and slight variations.
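The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a top-down, line-based decomposition of this kind could work. It assumes axis-aligned lines that have already been extracted from the form image, and it splits a region recursively wherever a line spans it completely. The names `Line`, `build_tree`, and `tol` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """An axis-aligned line segment (hypothetical representation)."""
    horizontal: bool  # True: y == pos, running from start to end along x
    pos: float        # fixed coordinate (y if horizontal, x if vertical)
    start: float      # extent along the other axis
    end: float


def build_tree(lines, x0, y0, x1, y1, tol=1.0):
    """Recursively split the region (x0, y0, x1, y1) along spanning lines.

    Returns a nested tuple that records only the kind of split ("H" or "V")
    and the order of the parts, never the coordinates.
    """
    # Horizontal lines strictly inside the region that span its full width.
    h_cuts = sorted({l.pos for l in lines
                     if l.horizontal
                     and y0 + tol < l.pos < y1 - tol
                     and l.start <= x0 + tol and l.end >= x1 - tol})
    if h_cuts:
        bounds = [y0, *h_cuts, y1]
        return ("H", tuple(build_tree(lines, x0, a, x1, b, tol)
                           for a, b in zip(bounds, bounds[1:])))

    # Otherwise, vertical lines strictly inside that span the full height.
    v_cuts = sorted({l.pos for l in lines
                     if not l.horizontal
                     and x0 + tol < l.pos < x1 - tol
                     and l.start <= y0 + tol and l.end >= y1 - tol})
    if v_cuts:
        bounds = [x0, *v_cuts, x1]
        return ("V", tuple(build_tree(lines, a, y0, b, y1, tol)
                           for a, b in zip(bounds, bounds[1:])))

    # No spanning line left: the region is an undivided cell.
    return ("cell",)


# Two forms with different sizes and line positions but the same layout:
# a header row on top, and a body split into two columns below it.
form_a = [Line(True, 30, 0, 100), Line(False, 50, 30, 100)]
form_b = [Line(True, 20, 0, 200), Line(False, 120, 20, 80)]
tree_a = build_tree(form_a, 0, 0, 100, 100)
tree_b = build_tree(form_b, 0, 0, 200, 80)

# A form with a single full-height vertical line has a different layout.
form_c = [Line(False, 40, 0, 100)]
tree_c = build_tree(form_c, 0, 0, 100, 100)
```

Because the tree stores only split directions and ordering, `tree_a` and `tree_b` compare equal even though the two forms differ in size and line positions, while `tree_c` differs. The tolerance `tol` is an assumed way to absorb small misalignments in line endpoints.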


Similar resources

Document clustering based on ontology and a fuzzy approach

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, one of the important techniques of text mining, is the unsupervised classification of similar documents into different groups. The most important step...


A New Heuristic Algorithm for Drawing Binary Trees within Arbitrary Polygons Based on Center of Gravity

Graphs are used extensively in software engineering, networking, and electrical engineering. In fact, graph drawing is a geometric representation of information. Among graphs, trees receive particular attention because of their capacity for hierarchical extension and their use in processing VLSI circuits. Many algorithms have been proposed for drawing binary trees within polygons. However, these algorithms generate b...


A Meta Heuristic Method Representation for Selection of Alternative Energy Production Methods

In operations research and decision theory, selection is the most important process. Selection is a process that studies multiple qualitative and quantitative criteria, related to the science of management, which are mostly incompatible with each other. The multi-criteria selection of a renewable energy portfolio is one of the main issues considered in multi-criteria literatur...


Form Document Representation and Identification

A form processing system aims to extract meaningful data from a form document for office automation [1, 2, 3]. The main interest is to extract user filled-in data, which is considered to be meaningful. However, in order to perform such a task, the structure of the form should be known in advance. The form structure can be obtained by processing a blank model form on which no user filled-in data exists....




Publication date: 1998